Learning GP-trees from Noisy Data

Authors

  • José L. Montaña
  • César L. Alonso
  • Cruz E. Borges
Abstract

We discuss the problem of model selection in Genetic Programming using the framework provided by Statistical Learning Theory, i.e. Vapnik-Chervonenkis (VC) theory. We present empirical comparisons between classical statistical methods for model selection (AIC, BIC) and the Structural Risk Minimization method (based on VC theory) for symbolic regression problems. Empirical comparisons of the different methods suggest practical advantages of using VC-based model selection when using genetic training.

Keywords: model selection, genetic programming, symbolic regression
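
As a concrete illustration of the criteria being compared, the sketch below (not the authors' code) scores polynomial models of increasing degree on noisy data using AIC, BIC and an SRM-style penalized risk. It assumes the common Cherkassky-Mulier form of the VC penalization factor and takes the VC dimension of a degree-d polynomial to be d + 1, its number of free coefficients; the paper's exact formulation may differ.

import numpy as np

rng = np.random.default_rng(0)
n = 30
x = np.linspace(-1.0, 1.0, n)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)  # noisy target sample

def scores(degree):
    # Least-squares polynomial fit and its residual sum of squares.
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
    k = degree + 1                             # number of free parameters
    aic = n * np.log(rss / n) + 2 * k          # Akaike information criterion
    bic = n * np.log(rss / n) + k * np.log(n)  # Bayesian information criterion
    # SRM-style score: empirical risk inflated by a VC penalization factor
    # (assumption: Cherkassky-Mulier form with VC dimension h = k).
    p = k / n
    denom = 1.0 - np.sqrt(p - p * np.log(p) + np.log(n) / (2 * n))
    srm = np.inf if denom <= 0 else (rss / n) / denom
    return aic, bic, srm

for d in range(1, 11):
    aic, bic, srm = scores(d)
    print(f"degree {d:2d}  AIC {aic:8.2f}  BIC {bic:8.2f}  SRM {srm:8.4f}")

AIC and BIC penalize complexity additively through the parameter count, while the VC-based score inflates the empirical risk by a multiplicative factor that grows quickly as model capacity approaches the sample size.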

Similar Articles

Penalty Functions for Genetic Programming Algorithms

Very often symbolic regression, as addressed in Genetic Programming (GP), is equivalent to approximate interpolation. This means that, in general, GP algorithms try to fit the sample as well as possible, but no notion of generalization error is considered. As a consequence, overfitting, code bloat and noisy data are problems that are not satisfactorily solved under this approach. Motivated by...
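
As a rough illustration of this idea (a minimal sketch, not the method of that paper: the nested-tuple expression encoding and the complexity weight lam are hypothetical), a GP fitness can combine the sample error with a structural penalty that discourages bloat:

import numpy as np

def tree_size(node):
    # Node count of a nested-tuple expression tree, e.g. ('+', 'x', ('*', 'x', 'x')).
    if not isinstance(node, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in node[1:])

def evaluate(node, x):
    # Evaluate the toy expression tree on an input array x.
    if node == 'x':
        return x
    if not isinstance(node, tuple):
        return np.full_like(x, float(node))    # numeric constant leaf
    op, *args = node
    vals = [evaluate(a, x) for a in args]
    return {'+': np.add, '-': np.subtract, '*': np.multiply}[op](*vals)

def penalized_fitness(tree, x, y, lam=0.01):
    # Empirical error plus a complexity term weighted by the (hypothetical) lam.
    mse = float(np.mean((evaluate(tree, x) - y) ** 2))
    return mse + lam * tree_size(tree)

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 50)
y = x ** 2 + 0.1 * rng.normal(size=50)
print(penalized_fitness(('*', 'x', 'x'), x, y))               # compact tree
print(penalized_fitness(('+', ('*', 'x', 'x'), 0.0), x, y))   # same function, more nodes

The second call returns a worse (larger) fitness even though both trees compute the same function, which is exactly the behaviour a bloat penalty aims for.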

Application of Genetic Programming to Induction of Linear Classification Trees

A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP, we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using Strong Typing in GP is introduced. With this representation it is no...

Learning from Multiple Annotators with Gaussian Processes

In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, possibly noisy, labels from multiple annotators. Typically, annotators have different levels of expertise (i.e., novice, expert) and there is considerable disagreement among annotators. We present a Gaussian process (GP) approach ...

Improving Induction of Linear Classification Trees with Genetic Programming

A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP, we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using Strong Typing in GP was introduced in [Bot and Langdon, 2000]. The effe...

Application of Genetic Programming to Induction of Linear Classification Trees

A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a benchmark of classification problems. In particular, using GP we are able to induce decision trees with a linear combination of variables in each function node. The effects of techniques such as limited error fitness, fitness sharing Pareto scoring and domination Pareto sco...

Journal:

Volume:   Issue:

Pages:   -

Publication year: 2009